Paper
Statistically-constrained shallow text marking: techniques, evaluation paradigm and results
27 February 2007
Abstract
We present three natural language marking strategies based on fast and reliable shallow parsing techniques and on widely available lexical resources: lexical substitution, adjective conjunction swaps, and relativiser switching. We test these techniques on a random sample of the British National Corpus. Individual candidate marks are checked for goodness of structural and semantic fit, using both lexical resources and the web as a corpus. A representative sample of marks is given to 25 human judges to evaluate for acceptability and preservation of meaning. This establishes a correlation between corpus-based felicity measures and perceived quality, and supports qualified predictions. Grammatical acceptability correlates strongly with our automatic measure (Pearson's r = 0.795, p = 0.001), allowing us to account for about two thirds of the variability in human judgements. A moderate but not statistically significant correlation (Pearson's r = 0.422, p = 0.356) is found with judgements of meaning preservation, indicating that the contextual window of five content words used for our automatic measure may need to be extended.
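
The correlation figures reported above are standard Pearson product-moment correlations between an automatic felicity score and averaged human ratings. The following minimal Python sketch shows how such a figure is computed; the score and rating values are hypothetical placeholders, not the study's data.

# Sketch: correlating an automatic corpus-based felicity measure with
# averaged human judgements of candidate marks.
# All numeric values below are hypothetical, for illustration only.
from scipy.stats import pearsonr

# Automatic felicity scores for a set of candidate marks (hypothetical).
automatic_scores = [0.82, 0.61, 0.45, 0.90, 0.30, 0.74, 0.55, 0.68]

# Mean acceptability ratings from human judges for the same marks (hypothetical).
human_ratings = [4.5, 3.8, 2.9, 4.7, 2.1, 4.0, 3.2, 3.9]

r, p = pearsonr(automatic_scores, human_ratings)
print(f"Pearson's r = {r:.3f}, p = {p:.3f}")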
© 2007 Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Brian Murphy and Carl Vogel "Statistically-constrained shallow text marking: techniques, evaluation paradigm and results", Proc. SPIE 6505, Security, Steganography, and Watermarking of Multimedia Contents IX, 65050Z (27 February 2007); https://doi.org/10.1117/12.713355
CITATIONS
Cited by 14 scholarly publications.
KEYWORDS
Transform theory
Digital watermarking
Switches
Data hiding
Data modeling
Head
Lead